Some non-F0 cues to emotional speech: An experiment with morphing
Authors
Abstract
This paper investigates some non-F0 cues to emotional speech. Two samples of the word “leave” were collected from spontaneous speech: one spoken with emotion (sad) and the other non-emotional. Using the morphing algorithm of STRAIGHT [1], we generated a series of 12 morphed utterances, ranging from the non-emotional “leave” to the emotional “leave”, with F0 held at 300 Hz. Perception test results show that the morphed speech sounds could be identified as sad, with stimulus 12 heard as the most emotional. The results of a simple correlation analysis, together with a principal component analysis (PCA) of listeners’ perceptual behavior, suggest that formant frequencies, specifically the lowering of F2, F3, and F4, are important cues for the perception of emotional (sad) speech.
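The 12-step continuum described in the abstract can be pictured with a minimal sketch. It is not the STRAIGHT morphing algorithm, which operates on full time-frequency representations. It only illustrates evenly spaced morphing rates and a plain linear interpolation of formant values between the two endpoints. The function names, the even spacing of the rates, the assumption that stimulus 1 and stimulus 12 are the unmodified endpoints, and the formant values are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: NOT the STRAIGHT algorithm.
# All names and numeric formant values below are hypothetical placeholders.

def morph_rates(n_steps=12):
    """Evenly spaced morphing rates: 0.0 = non-emotional, 1.0 = emotional."""
    return [i / (n_steps - 1) for i in range(n_steps)]


def interpolate_formants(neutral_hz, emotional_hz, rate):
    """Linearly interpolate each formant (in Hz) at the given morphing rate."""
    return [(1.0 - rate) * n + rate * e for n, e in zip(neutral_hz, emotional_hz)]


# Hypothetical F2, F3, F4 values (Hz) for the two endpoint utterances.
NEUTRAL = [2300.0, 3000.0, 4000.0]
EMOTIONAL = [2100.0, 2800.0, 3700.0]

# One interpolated formant set per stimulus in the continuum.
continuum = [interpolate_formants(NEUTRAL, EMOTIONAL, r) for r in morph_rates()]

for index, formants in enumerate(continuum, start=1):
    print(index, [round(f, 1) for f in formants])
```

In this sketch the formants fall steadily from stimulus 1 to stimulus 12 only because the placeholder endpoint values were chosen that way. The measured formants of the actual stimuli are reported in the paper, not here.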
Similar resources
Word segmentation in Persian continuous speech using F0 contour
Word segmentation in continuous speech is a complex cognitive process. Previous research on spoken word segmentation has revealed that in fixed-stress languages, listeners use acoustic cues to stress to de-segment speech into words. It has been further assumed that stress in non-final or non-initial position hinders the demarcative function of this prosodic factor. In Persian, stress is retract...
The Effects of Culture and Gender on the Recognition of Emotional Speech: Evidence from Persian Speakers Living in a Collectivist Society
This paper reports on a behavioral study that explores the role of culture and gender in the recognition of emotional speech in an under-investigated cultural context (a collectivist society, i.e., Iran). Participants were asked to recognize the emotional prosody of a set of validated emotional vocal portrayals (including the five basic emotions). Findings of the experiment were then comp...
Auditory morphing based on an elastic perceptual distance metric in an interference-free time-frequency representation
An elastic spectral distance measure, based on F0-adaptive pitch-synchronous spectral estimation and selective elimination of periodicity interferences and developed for the high-quality speech modification procedure STRAIGHT [1], is introduced to provide a basis for auditory morphing. The proposed measure is implemented on a low-dimensional piecewise bilinear time-frequency mapping betwee...
On the robustness of overall F0-only modifications to the perception of emotions in speech.
Emotional information in speech is commonly described in terms of prosody features such as F0, duration, and energy. In this paper, the focus is on how F0 characteristics can be used to effectively parametrize emotional quality in speech signals. Using an analysis-by-synthesis approach, F0 mean, range, and shape properties of emotional utterances are systematically modified. The results show th...
"Pitch" accent in alaryngeal speech.
Highly proficient alaryngeal speakers are known to convey prosody successfully. The present study investigated whether alaryngeal speakers not selected on grounds of proficiency were able to convey pitch accent (a pitch accent is realized on the word that is in focus, cf. Bolinger, 1958). The participating speakers (10 tracheoesophageal, 9 esophageal, and 10 laryngeal [control] speakers) produc...